SYSTEM AND METHOD FOR INDEXING AND TRACKING MULTIMEDIA STREAMS FOR WIRELESS MULTIMEDIA TRANSMISSION



I. FIELD OF THE INVENTION 

[0001] The present invention relates generally to computer-based communication systems. 

II. BACKGROUND OF THE INVENTION 

[0002] Digital multimedia data such as video and music can be transmitted wirelessly to mobile 

receivers, such as wireless telephones, for playing of the multimedia by users of the mobile 

receivers. Such data typically may be broadcast. 

[0003] The multimedia can be formatted in accordance with Moving Pictures Expert Group 

(MPEG) standards such as MPEG-1, MPEG-2 (also used for DVD format), MPEG-4 and other 
block based transform codecs. Essentially, for individual video frames these multimedia 
standards use Joint Photographic Experts Group (JPEG) compression. In JPEG, the image of a 
single frame is typically divided into small blocks of pixels (usually 8x8 and/or 16x16 pixel 
blocks) that are encoded using a discrete cosine transform (DCT) function to transform the spatial 
intensity values represented by the pixels to spatial frequency values, roughly arranged, in a 
block, from lowest frequency to highest. Then, the DCT values are quantized, i.e., the 
information is reduced by grouping it into chunks by, e.g., dividing every value by 10 and 
rounding off to the nearest integer. Since the DCT function includes a progressive weighting that 
puts bigger numbers near the top left corner of a block and smaller numbers near the lower right 
corner, a special zigzag ordering of values can be applied that facilitates further compression by 
run-length coding (essentially, storing a count of the number of, e.g., zero values that appear 
consecutively, instead of storing all the zero values). If desired, the resulting numbers may be 
used to look up symbols from a table developed using Huffman coding to create shorter symbols 
for the most common numbers, an operation commonly referred to as "variable length coding". 
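
By way of non-limiting illustration only, the quantizing, zigzag ordering, and run-length steps described above might be sketched in Python as follows. The function names, the 4x4 block size, and the (zero run, value) pair format are assumptions made for this sketch and are not taken from the JPEG or MPEG specifications. The DCT is assumed to have been applied already, and the variable length coding step is omitted.

```python
def quantize(block, step=10):
    """Divide every DCT value by `step` and round to the nearest integer.

    Note: Python's round() rounds exact halves to the nearest even integer.
    """
    return [[int(round(v / step)) for v in row] for row in block]


def zigzag_indices(n):
    """Return (row, col) pairs of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Odd diagonals are walked with the row increasing, even ones reversed.
        order.extend(diag if s % 2 else reversed(diag))
    return order


def zigzag_scan(block):
    """Flatten a square block into a list following the zigzag order."""
    return [block[r][c] for r, c in zigzag_indices(len(block))]


def run_length_zeros(values):
    """Encode values as (number of preceding zeros, nonzero value) pairs.

    A trailing run of zeros is stored as (run, None); this marker is an
    assumption of the sketch.
    """
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:
        pairs.append((run, None))
    return pairs


# Hypothetical DCT output for one small block (largest values at top left).
dct_block = [[312, 41, 4, 0],
             [-38, 12, 0, 0],
             [6, 0, 0, 0],
             [0, 0, 0, 0]]
encoded = run_length_zeros(zigzag_scan(quantize(dct_block)))
```

In this example the sixteen quantized values reduce to six pairs, because the zigzag ordering gathers the eleven zero values into a single trailing run.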

[0004] In any case, a JPEG-encoded stream represents horizontal lines of a picture, in much the 

same way as the underlying pixel data is arranged in a matrix of horizontal rows. It should be 
appreciated that while block-based video encoding is being used as an example, the principles for 
indexing and tracking encompassed herein apply to other types of audio and video codecs and 
other forms of multimedia data. 

[0005] It will be appreciated that JPEG compression results in lost information. However, owing 

to the phenomenon of human perception and the way that the above process works, JPEG 
compression can reduce a picture to about one-fifth of its original size with virtually no 
discernible difference and to one-tenth of its original size with only slight degradation. 

[0006] Motion pictures add a temporal dimension to the spatial dimension of single pictures. 

Typical motion pictures have thirty frames, i.e., thirty still pictures, per second of viewing time. 
MPEG is essentially a compression technique that uses motion estimation to further compress a 
video stream. 

[0007] MPEG encoding breaks each picture into blocks called "macroblocks", and then searches 

neighboring pictures for similar blocks. If a match is found, instead of storing all of the DCT 
values for the entire block, the system stores a much smaller vector that describes the movement 
(or not) of the block between pictures. In this way, efficient compression is achieved. 
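
As a minimal sketch of the block search described above, and not a statement of how any particular MPEG encoder operates, the following Python code looks in a neighboring picture for the block that best matches a block of the current picture. The sum of absolute differences used as the matching cost, the small search radius, and all names are assumptions of the sketch.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a block of `cur` at (cx, cy)
    and a block of `ref` at (rx, ry)."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))


def find_motion_vector(ref, cur, x, y, size=2, radius=1):
    """Return (dx, dy, cost) for the best match in `ref` of the block
    located at (x, y) in `cur`, searching +/- `radius` pixels."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rx, ry = x + dx, y + dy
            # Skip candidate positions that fall outside the picture.
            if 0 <= rx <= width - size and 0 <= ry <= height - size:
                cost = sad(cur, ref, x, y, rx, ry, size)
                if best is None or cost < best[2]:
                    best = (dx, dy, cost)
    return best


# A bright 2x2 patch moves one pixel right and one pixel down.
reference = [[9, 9, 0, 0],
             [9, 9, 0, 0],
             [0, 0, 0, 0],
             [0, 0, 0, 0]]
current = [[0, 0, 0, 0],
           [0, 9, 9, 0],
           [0, 9, 9, 0],
           [0, 0, 0, 0]]
vector = find_motion_vector(reference, current, x=1, y=1)
```

Here the block is found unchanged one pixel up and to the left in the reference picture, so only the small vector (-1, -1) would need to be stored instead of the block's DCT values.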

[0008] With more specificity, MPEG compression in general uses three kinds of video frames. 

Naturally, some frames, referred to as "intraframes" (also referred to as "reference frames", or "I 
frames" and "information frames"), in which the entire frame is composed of compressed, 
quantized DCT values, must be provided (e.g., around once every two seconds). But in MPEG 
compression the remaining frames (e.g., 59) that make up the rest of the video for those two seconds are 
very much smaller frames that refer to the intraframes, in accordance with MPEG compression 
principles. In MPEG parlance these frames are called "predicted" frames ("P frames") and 
"bidirectional" frames ("B frames"), herein collectively referred to as "interframes". 
[0009] Predicted frames are those frames that contain motion vector references to the preceding 

intraframe or to a preceding predicted frame, in accordance with the discussion above. If a block 
has changed slightly in intensity or color, then the difference between the two frames is also 
encoded in a predicted frame. Moreover, if something entirely new appears that does not match 
any previous blocks, then a new block or blocks can be stored in the predicted frame in the same 
way as in an intraframe. Note that, as used herein, such a new block is not a "predetermined 
portion" of an intraframe in that it arises only upon the random introduction of a new object of 
arbitrary size and position in the frame. 
[0010] In contrast, a bidirectional frame is used as follows. The MPEG system searches forward 

and backward through the video stream to match blocks (typically the nearest intraframe or 
predicted frame in each direction). Experience has shown that two bidirectional frames between 
each intraframe or predicted frame work well. 

[0011] In any case, multimedia data including video that has I-frames, P-frames, and B-frames is 

typically transmitted using Internet Protocol (IP) packaging and transport techniques, which are 
compatible with almost all physical layers (i.e., with almost all physical transmission paths, wired 
or wireless, that might be used). While using IP has the benefit of compatibility with a large 
number of transmission systems, it can entail undesired overhead costs in certain applications, 
particularly in wireless applications, including requiring the wireless receiver to use more battery 
power than might be necessary to receive a desired program. 

[0012] The present invention makes the following critical observation. 

Several multimedia programs might be transmitted on a single channel. Because data in IP is 
transmitted in data chunks, the programs can be multiplexed with each other to attain more 
efficient use of bandwidth. For example, a data chunk might contain a relatively large I-frame of 
one program, a relatively small P-frame of a second program, and a small B-frame of a third 
program, and so on, so that the data chunk is filled up as much as possible. Periodically (e.g., 
every few seconds), an index chunk is transmitted that indicates the order of the channels in the 
data chunks, so that a receiver wishing to receive a particular program knows which parts of the 
data chunks it should reconstitute into the desired program and which other parts of the chunks it 
can ignore as belonging to other programs. 
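
The multiplexing just described might be pictured with the following Python sketch, offered as an illustration only. It simplifies matters by building one index per data chunk and by storing a byte length for each program portion; the index contents beyond program order, and all names, are assumptions of the sketch rather than features of any deployed system.

```python
def build_chunk(frames):
    """Pack frames of several programs into one data chunk.

    frames: list of (program_id, payload_bytes) in transmission order.
    Returns (index, data), where index lists (program_id, length) in order.
    """
    index = [(program_id, len(payload)) for program_id, payload in frames]
    data = b"".join(payload for _, payload in frames)
    return index, data


def extract_program(index, data, wanted):
    """Return the bytes of the wanted program, or None if it is absent.

    Portions belonging to other programs are skipped using their lengths.
    """
    offset = 0
    for program_id, length in index:
        if program_id == wanted:
            return data[offset:offset + length]
        offset += length
    return None


# A large I-frame, a small P-frame, and a smaller B-frame share one chunk.
index, data = build_chunk([(1, b"IIIIIIII"), (2, b"PP"), (3, b"B")])
program_2 = extract_program(index, data, wanted=2)
```

A receiver interested only in program 2 uses the index to locate two bytes of the eleven-byte chunk and can ignore the remainder.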
[0013] As currently implemented, the discrimination between desired program data and data 

pertaining to other programs is undertaken at the application level of the receiver. This means, 
however, that the underlying transport and physical layer components of the receiver - the radio 
portion, in wireless applications - must remain energized during reception of all program 
information in the received chunks, including information pertaining to other-than-desired 
programs. The present invention recognizes that such unnecessary power consumption looms 
large in the context of wireless receivers, which are typically powered by batteries. 

SUMMARY OF THE INVENTION 
[0014] A communication method for wireless transmission of a multimedia stream having data 

chunks that carry information related to at least first and second programs and global index 
chunks that carry indexing information related to the data chunks includes, in at least some data 
chunks, establishing respective index blocks. The index block of a data chunk for a program 
indicates times associated with subsequent data chunks for that program. The method also 
includes, at a communication level other than an application level, using information in the index 
blocks to cause at least portions of a wireless receiver to be energized substantially only during 
times associated with a desired program and index blocks. 

[0015] In a preferred embodiment, the portions of the wireless receiver include an analog 

receiver. The index blocks can be the first blocks of the respective data chunks. An index block 
of a first chunk can indicate times for individual program information for at least the next data 
chunk for the same program following the first chunk. Or, an index block of a first chunk of a 
program can indicate times for at least the next two to N (2-N) data chunks for that program 
following the first chunk. The number N of times indicated may depend on the channel 
characteristics, including but not limited to the typical anticipated error profile. If desired, the 
program information can pertain to resolution divisions in a program and/or to temporal layer 
divisions and/or quality or SNR layer divisions in a program. 

[0016] In another aspect, a wireless receiver of multimedia programs for displaying at least a 

desired one of the programs includes an analog receiver and a receiver controller receiving 
information in data chunks. The controller uses the information to cause the receiver to be 
energized substantially only during periods of reception of data pertaining to the desired one of 
the programs. 

[0017] In still another aspect, a transmitting system for transmitting a multimedia stream 

including data chunks carrying information pertaining to at least first and second programs and 
global index chunks includes means for establishing, in at least first and second data chunks, at 
least respective first and second timing blocks indicating times associated with respective 
subsequent data chunks pertaining to the programs. In addition, within each individual 
program's timing block, there may be additional index data indicating times associated with 
multimedia layer access points within the program's data chunk (e.g. indexes pointing to the 
audio portion, text portion etc.). With this feature, a receiving system can use the times to 
selectively energize a receiver of the receiving system. 

[0018] The details of the present invention, both as to its structure and operation, can best be 

understood in reference to the accompanying drawings, in which like reference numerals refer to 
like parts, and in which: 

BRIEF DESCRIPTION OF THE DRAWINGS 
[0019] Figure 1 is a block diagram of the present system; 

[0020] Figure 2 is a schematic diagram of a multimedia stream according to the present 

invention; 

[0021] Figure 3 is a schematic diagram of a multimedia stream showing pointers to particular 

program portions; and 
[0022] Figure 4 is a flow chart of the overall logic of the invention. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT 
[0023] Referring initially to Figure 1, a system is shown, generally designated 10, that includes a 

wireless broadcast transmitting system having a transmitting processor 12 and conventional 
transmitting circuitry 14 which wirelessly broadcasts, using a preferably unidirectional channel 
16, digital multimedia content in the form of multimedia streams to wireless mobile station 
receivers 18 (only a single mobile station receiver 18 shown for clarity). 

[0024] The multimedia streams can be from one or more sources that communicate with or are 

otherwise associated with the broadcast transmitting system. The system 10 can use, without 
limitation, CDMA principles, GSM principles, or other wireless principles including wideband 
CDMA (WCDMA), cdma2000 (such as cdma2000 1x or 3x air interface standards, for example), 
HDR, TDMA, TD-SCDMA, and OFDM. The multimedia content can alternatively be 
provided over a bidirectional point-to-point link if desired, such as, e.g., a Bluetooth link or an 
802.11 link or a CDMA link or GSM link. 

[0025] The preferred receiver 18 shown in Figure 1 includes an antenna 20 with conventional 

preprocessing (e.g., preamplifying) circuitry. The signal from the antenna 20 may be sent to an rf 
controller 22 that can include conventional circuitry for establishing gain, AGC, etc. in 
accordance with principles known in the art. The rf controller 22 can also include a state 
machine that implements the logic set forth herein to energize an analog rf receiver 24 only 
during periods when data that is associated with a user-desired program in a multimedia stream is 
present, as set forth further below. 

[0026] In accordance with wireless principles known in the art, the output of the rf receiver 24 is 

sent to a demodulator 26, which demodulates the signal using, e.g., a Fast Fourier Transform or 
other demodulating paradigm to render encoded data bits. The data bits are sent to a decoder 28 
(e.g., convolutional, turbo, low-density parity-check) which decodes the bits and outputs a 
decoded bit stream. The decoded bit stream may be provided to an offset generator 30 which, if 
provided, generates on and off signals that are received and used by the rf controller 22. Also, 
the output of the decoder 28 may be fed back to the rf controller 22. 

[0027] Also, the output of the decoder 28 is sent to additional processing circuitry 32. The 

preferred non-limiting additional processing circuitry can use an outer code, such as Reed-Solomon 
or other block cyclic codes (e.g., BCH) to further process the data, which can then be 
sent to a receiver system processor 34 for causing a multimedia program to be displayed on a 
display 36. The system processor 34 can incorporate the additional processing circuitry if 
desired. The user can indicate the desired program using an appropriate input device 37 that is 
part of the mobile station associated with the receiver 18. 

[0028] As shown in Figure 1, when an offset generator 30 is provided, the decoder 28 can send it 

indexing data and channel tracking data derived from the decoded bit stream. If an offset 
generator 30 is not provided, the decoder 28 optionally can send on/off data directly to the rf 
controller 22. Or, the preferred rf controller 22 can obtain the on/off information directly from 
the incoming signal as set forth further below. It is understood that alternatively the on/off data 
could come from the output of the demodulator 26 and be sent either to the offset generator 30 or 
directly to the rf controller 22. 

[0029] Figure 2 shows the details of a multimedia stream 38 of the present invention. In 

accordance with multimedia transmission principles known in the art, several programs can be 
sent in the stream 38. In the example shown, programs 1-4 are included in the stream. It is to be 
understood that while Figure 2 shows the programs 1-4 in numeric order, the programs need not 
necessarily be packaged in numerical order. As illustrated in Figure 2, the programs are 
multiplexed by filling some or all data chunks 40 (two data chunks 40 shown in Figure 2 for 
illustration), with portions 41 of data ("frames") in each data chunk 40 pertaining to each of the 
four exemplary programs in accordance with multiplexing principles known in the art. 

[0030] In addition to the data chunks 40, global index chunks 42 can be provided periodically 

(e.g., every two seconds) to indicate the order of program data appearance in the multimedia 
stream. In the example shown, the programs appear in numerical order, i.e., the first portion 
of each data chunk 40 pertains to program #1, the second portion to program #2, and so on. The 
global index chunks 42 may also provide metadata information, temporal location of the first byte 
of the first subsequent frame for each program (in the first following data chunk), and other 
programming information. 

[0031] In accordance with the present invention, some or all data chunks 40 can include index or 
timing block(s) 44. Preferably, each individual program frame 41 of each data chunk 40 includes 
an index block 44, as indicated in dashed lines in Figure 2. The index block 44 may be the first 
block of its frame 41. Each index block 44 contains what might be regarded as pointers to the 
temporal locations of the frames 41 of the same program in the subsequent data chunks 40, e.g., 
in the subsequent two to four data chunks, as indicated by the arrow 46 in Figure 2, although only 
the temporal location one program frame ahead might be indicated. For clarity of disclosure, 
Figure 2 shows only the pointers associated with program 1, it being understood that programs 2-4 
include index blocks with similar pointers. Thus, an index block 44 in a program 1 frame of a 
data chunk 40 points to the temporal locations of the index blocks of the next few program 1 
frames in subsequent data chunks 40, an index block 44 in a program 2 frame of a data chunk 40 
points to the temporal locations of the index blocks of the next few program 2 frames in 
subsequent data chunks 40, and so on. Likewise, the index block 44 in a program 1 frame of a 
second data chunk 40 points to the temporal locations of the index blocks of the next few 
program 1 frames in subsequent data chunks 40. Thus, once a receiver 18 has successfully 
acquired a program channel (e.g., program 1) it can continually find successive frames 41 for that 
program without referencing or receiving the subsequent global index chunks 42. The reason for 
having an index block 44 point to multiple program frames 41 is to compensate for the expected 
packet losses typical of lossy transmission systems such as wireless cellular systems. If one 
program frame 41 is corrupted by packet errors, the receiver simply waits until the next 
scheduled data chunk 40 for the next occurrence of the program frame 41. If the run of errors 
lasts longer than the total time covered by the last pointer in the index list, the receiver simply 
returns to searching for the next global index chunk 42 and begins again. 
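
The tracking behavior described in this paragraph might be sketched as follows, purely as an illustration. The sketch assumes that each index block carries pointers to the next `lookahead` frames of the same program and that the first frame has already been acquired; the function name, the status strings, and the representation of losses as a set of times are hypothetical.

```python
def track_program(frame_times, lost, lookahead=3):
    """Follow index-block pointers for one program.

    frame_times: ascending times at which the program's frames are sent;
                 the frame at frame_times[0] is assumed already received.
    lost:        set of times whose frame was corrupted by packet errors.
    lookahead:   number of subsequent frames each index block points to.
    Returns (received_times, status).
    """
    received = [frame_times[0]]
    i = 0
    while i + 1 < len(frame_times):
        # Pointers carried by the index block of the last good frame.
        pointers = frame_times[i + 1:i + 1 + lookahead]
        hit = next((t for t in pointers if t not in lost), None)
        if hit is None:
            # Every pointed-to frame was lost: fall back to the global index.
            return received, "reacquire via global index chunk"
        received.append(hit)
        i = frame_times.index(hit)
    return received, "tracking"


times = [0, 10, 20, 30, 40, 50]
survives = track_program(times, lost={10, 20})
falls_back = track_program(times, lost={10, 20, 30})
```

With three pointers per index block, two consecutive lost frames are bridged by the third pointer, whereas three consecutive losses exhaust the pointer list and force the receiver back to the global index chunks.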
[0032] The index blocks 44 may also contain pointers 48 to the next index block associated with 

the program that is associated with a "program down" keystroke, so that when a user changes 
programs by depressing a "program down" button on the input device 37, the rf controller need 
simply shift to receiving the previous program in the stream. For example, a program 2 index 
block contains not only pointers to the next few program 2 index blocks, but also to the next 
instance of program 1 index block which appears in the next data chunk 40. In a four program 
example, a program 1 index block contains not only pointers to the next few program 1 index 
blocks, but also to the next program 4 index block. Should the user depress a "program up" 
button, the receiver simply switches to the very next program data in the stream by using the 
length of the current program as an offset to find the beginning of the next program (e.g., if the 
current program was 2, the next program selected would be 3). After the user has selected a new 
program, the receiver can begin tracking that program by accessing its index block 44. In this way 
a receiver 18 can increment and decrement through program channels without needing to read the 
global index chunks 42. 
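
A possible in-memory form of such an index block, and the two channel-change computations, is sketched below. The field names and the use of start time and length values are assumptions of the sketch; the disclosure does not prescribe a particular layout.

```python
from dataclasses import dataclass


@dataclass
class IndexBlock:
    program: int      # program to which this frame belongs
    start: float      # time at which this program frame begins
    length: float     # duration of this program frame
    next_same: list   # times of the next few frames of the same program
    next_down: float  # time of the next index block of the "program down" program


def down_neighbour(program, num_programs):
    """Program selected by "program down": the previous program, with
    program 1 wrapping to the last program as in the four-program example."""
    return num_programs if program == 1 else program - 1


def on_program_up(block):
    """The next program's data begins where the current frame ends."""
    return block.start + block.length


def on_program_down(block):
    """The previous program is reached through the stored pointer."""
    return block.next_down


block = IndexBlock(program=2, start=5.0, length=3.0,
                   next_same=[25.0, 45.0], next_down=20.0)
up_time = on_program_up(block)      # start of program 3 data in this chunk
down_time = on_program_down(block)  # next program 1 index block
```

Neither computation consults a global index chunk: "program up" needs only the current frame's own length, and "program down" needs only the one extra pointer carried in the index block.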

[0033] Moreover, as shown in Figure 3 an index block 44 can contain not only pointers to 

subsequent program frame index blocks, but also pointers 49 to starting locations of particular 
program portions, such as various video layers and associated audio, multimedia portions of the 
program, etc., so that, if desired, the receiver can be energized only to receive video, audio, etc. 
Thus, an index block 44 may contain information pertaining to multimedia data divisions in a 
program (i.e., where a video 1 layer starts, where a video 2 layer starts, and where audio and 
other multimedia layers start). 
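
As a hedged illustration of how such pointers could be used, the Python sketch below converts the starting locations of the portions of one program frame into reception windows for only the portions a receiver wants. The sketch assumes that each portion ends where the next portion begins and that the end of the frame is known; the layer names and time values are hypothetical.

```python
def windows_for_layers(layer_starts, frame_end, wanted):
    """Return (name, start, end) windows for the wanted portions.

    layer_starts: dict mapping a portion name to its start time.
    frame_end:    time at which the program frame ends.
    wanted:       set of portion names the receiver needs.
    """
    ordered = sorted(layer_starts.items(), key=lambda item: item[1])
    windows = []
    for i, (name, start) in enumerate(ordered):
        # A portion is assumed to run until the next portion starts.
        end = ordered[i + 1][1] if i + 1 < len(ordered) else frame_end
        if name in wanted:
            windows.append((name, start, end))
    return windows


pointers = {"video1": 2, "video2": 10, "audio": 16}
base_only = windows_for_layers(pointers, frame_end=20,
                               wanted={"video1", "audio"})
```

In this example a receiver that needs only the first video layer and the audio would be energized for two windows and could remain de-energized while the second video layer is transmitted.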

[0034] In this way, the rf controller 22 is informed (directly from the received data or indirectly 

from the offset generator 30 and/or decoder 28 and/or the demodulator 26) when the rf receiver 
24 must be energized to receive a particularly selected program/program portion/program layer, 
and to receive the next index block 44 associated with that program so that program data in, e.g., 
the next few, data chunks 40 can be received. 

[0035] With the above disclosure in mind, attention is now directed to Figure 4. Commencing at 

block 50, the transmitting processor 12 inserts or otherwise establishes the index blocks 44 in the 
multimedia stream 38. The stream is transmitted wirelessly at block 52 and received by the 
mobile station at block 54. 

[0036] Proceeding to block 56, the receiver 24 is caused to be energized only during the times 

associated with the user-desired program/program portion/program layer, using the information 
in the index blocks 44. The receiver 24 is de-energized during other times (except for times 
associated with future index blocks 44). It may now be appreciated that the receiver 24 is 
controlled at the physical or transport layer and not at the application level. It can be appreciated 
that an alternative implementation could allow the application layer to control the receiver 24 but 
at the cost of additional delay and power consumption. 
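
The effect of the logic at block 56 can be illustrated with a short Python sketch that turns index-block pointers into on/off windows and estimates the fraction of time the receiver is energized. The guard interval allowing the receiver to settle before each frame, the fixed frame duration, and the millisecond figures are assumptions made for the example and are not values taken from this disclosure.

```python
def wake_windows(pointers, frame_duration, guard=0):
    """Return (on_time, off_time) pairs, one per pointed-to program frame.

    pointers:       start times of the desired program's upcoming frames.
    frame_duration: time needed to receive one frame and its index block.
    guard:          lead time before each frame (assumed warm-up margin).
    """
    return [(t - guard, t + frame_duration) for t in pointers]


def duty_cycle(windows, period):
    """Fraction of `period` during which the receiver is energized."""
    on_time = sum(off - on for on, off in windows)
    return on_time / period


# Desired program frames start at 100, 200, and 300 ms and last 20 ms each.
windows = wake_windows([100, 200, 300], frame_duration=20, guard=5)
fraction_on = duty_cycle(windows, period=300)
```

Under these assumed figures the receiver is energized for 75 ms out of every 300 ms, i.e., one quarter of the time, rather than continuously.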

[0037] It is to be further appreciated that the program information in the index blocks 44 can 

pertain to resolution divisions in a program, and/or to video layer divisions in a program. That 
means that lower resolution mobile stations, e.g., QCIF (Quarter Common Intermediate Format 
176x144 pixels) mobile stations, can cause their analog receiver to be energized only during 
periods of data reception required to support lower resolution display, whereas higher resolution 
mobile stations, e.g., CIF (Common Intermediate Format 352x288 pixels) mobile stations, can 
cause their analog receiver to be energized during periods of data reception required to support 
higher resolution display. In addition to resolution divisions, the multimedia data could be 
separated into quality, temporal or other divisions depending on the needs of the system, 
program, preferences etc. 

[0038] Non-battery-powered and/or wired systems may not benefit from battery power savings (though 
it is always good to save power), but the bandwidth and/or spectrum savings afforded by the present 
invention would still apply and be useful to any multimedia delivery system, battery powered or 
not. 

[0039] While the particular SYSTEM AND METHOD FOR INDEXING AND TRACKING 

MULTIMEDIA STREAMS FOR WIRELESS MULTIMEDIA TRANSMISSION as herein 
shown and described in detail is fully capable of attaining the above-described objects of the 
invention, it is to be understood that it is the presently preferred embodiment of the present 
invention and is thus representative of the subject matter which is broadly contemplated by the 
present invention, that the scope of the present invention fully encompasses other embodiments 
which may become obvious to those skilled in the art, and that the scope of the present invention 
is accordingly to be limited by nothing other than the appended claims, in which reference to an 
element in the singular is not intended to mean "one and only one" unless explicitly so stated, but 
rather "one or more". Moreover, it is not necessary for a device or method to address each and 
every problem sought to be solved by the present invention, for it to be encompassed by the 
present claims. Furthermore, no element, component, or method step in the present disclosure is 
intended to be dedicated to the public regardless of whether the element, component, or method 
step is explicitly recited in the claims. No claim element herein is to be construed under the 
provisions of 35 U.S.C. section 112, sixth paragraph, unless the element is expressly recited 
using the phrase "means for" or, in the case of a method claim, the element is recited as a "step" 
instead of an "act". 



